Abstract

Effective  Command  and  Control  (C2)  requires  the  rapid  comprehension  of  the  identity 
and  other  attributes  of  tracks  and  other  objects  in  three-dimensional  (3-D)  space. 
Advances in computing speed and power are enabling display designers to create real-time
prototype 3-D displays for this purpose. By 3-D display, we mean a display that
shows  a  perspective  projection  of  all  three  dimensions  of  physical  space  onto  a  flat  CRT. 
One  example  of  a  3-D  prototype  C2  display  is  the  Area  Air  Defense  Commander 
(AADC)  prototype  display  (Dennehy,  Nesbitt  &  Sumey,  1994).  These  new  3-D 
prototypes  are  extremely  compelling.  They  offer  a  radical  increase  in  realism  of  the 
scenes  they  depict  over  existing  2-D  C2  displays.  Their  naturalistic  look  and  easy  feel 
make them attractive to users, who consistently express a strong preference for them. But
just because users are clamoring for these 3-D displays, and because we can now provide
them, does this mean that we should advocate their ubiquitous adoption for C2?
The  experimental  literature  comparing  2-D  and  3-D  displays  is  large,  complicated  and 
contradictory,  often  showing  mixed  advantages  for  3-D  displays,  at  best.  The  Navy’s 
Perspective  Display  Technology  (PVT)  project  has  been  conducting  human  factors 
research addressing these issues. In this talk, we review an array of PVT’s experimental
studies that offers a consistent, and often counter-intuitive, set of results and guidelines
on the where, what and how of 3-D perspective display use for C2 tasks.


Introduction 

The  ongoing  revolution  in  the  availability  of  inexpensive  and  fast  3-D  rendering 
technologies is allowing display designers to develop 3-D prototype displays for C2, such as
the one shown in Figure 1 (Dennehy, Nesbitt & Sumey, 1994). By 3-D display, we mean
a display that shows a perspective view of a scene on a CRT or other flat computer
display.  The  image  is  two  dimensional  (2-D),  but  the  oblique  viewing  angle  means  that 
all  three  dimensions  are  projected  and  represented,  to  provide  a  3-D  perspective.  There 
are  various  other  ‘true’  holographic  and  stereoscopic  3-D  displays  under  development 
(e.g.,  Soltan  et  al.,  1998)  but  most  interest  in  3-D  displays  is  in  flat  screen  displays  like 
that  shown  in  Figure  1. 


Figure  1.  Screenshot  from  a  prototype  3-D  perspective  display  for  naval  air  warfare 
(from  Dennehy,  Nesbitt  &  Sumey,  1994). 

There are several reasons to suppose that 3-D displays may be preferable to conventional
2-D displays that show an environment from directly above. First, because our retinal
images are perspective projections of the world, 3-D displays may be inherently more
ecologically plausible than 2-D displays. Similarly, their naturalistic look has led some
3-D display designers to suggest that they may require “minimal interpretive effort”
(Dennehy,  Nesbitt  &  Sumey,  1994).  Second,  because  3-D  displays  integrate  all  three 
dimensions  of  space  into  a  single  display,  users  may  be  spared  the  mentally  demanding 
process  of  scanning  back  and  forth  to  integrate  two  planar  views  in  order  to  gauge  3-D 
spatial relationships (Haskell & Wickens, 1993). Third, users simply prefer the
familiarity  and  easy  feel  of  3-D  displays. 

However,  there  are  counter-arguments  to  each  of  these  points.  First,  if  a  scene  were 
reproduced  with  the  exact  same  fidelity  as  retinal  images  of  that  scene,  those  images 
would  still  need  to  be  interpreted.  A  century  of  perceptual  work  since  Helmholtz  has 
documented  the  difficulties  inherent  in  natural  scene  interpretation.  Second,  the 
compression  of  three  dimensions  onto  a  flat  display  integrates  all  dimensions  but  leaves 
each  one  somewhat  ambiguous  (see  Figure  2).  This  ambiguity,  coupled  with  the 
distortion of distances and angles inherent in a perspective projection (Sedgewick, 1986),
makes 3-D displays of questionable utility for precise relative position tasks. Third,
basing  display  decisions  on  user  preference  is  not  always  sound  because  users  do  not 
always  want  what  is  best  for  them  (Andre  &  Wickens,  1995). 

Figure 2. Viewing geometry and line of sight (LoS) ambiguities of 2-D and 3-D displays
(from  Smallman,  Schiller  &  Cowen,  2000).  Position  and  distance  are  ambiguous  along 
the  line  of  sight  in  either  viewing  geometry. 

These conflicting arguments are complemented by a large and ever-growing literature
documenting  a  mixed  pattern  of  results  for  2-D  vs.  3-D  display  comparisons  (for  a  recent 
review  and  synthesis,  see  St.  John,  Cowen,  Smallman  &  Oonk,  2001).  The  Navy’s 
Perspective  Display  Technology  (PVT)  project  has  been  conducting  human  factors 
research addressing these issues. Here, we review an array of PVT’s experimental
studies that offers a consistent, and often counter-intuitive, set of results and guidelines
on the where, what and how of 3-D perspective display use for C2 tasks.

Where 


Static  2-D  and  3-D  displays  differ  primarily  in  their  viewpoint  location  (Figure  2).  2-D 
displays  show  the  world  from  a  viewpoint  directly  above,  looking  down  at  90  degrees  to 
the  ground-plane.  3-D  displays  show  the  world  from  above  and  to  the  side,  generally 
between  25  and  45  degrees  to  the  ground-plane.  This  difference  turns  out  to  greatly 
affect  the  ability  of  the  display  to  depict  where  objects  are  in  space.  Unlike  2-D  displays, 
where  only  the  z  axis  (aircraft  altitude)  is  completely  ambiguous,  the  oblique  viewpoint 
of 3-D displays makes all three dimensions somewhat uncertain. This uncertainty,
coupled with the distortions of distances and angles from perspective projection, throws
into  question  a  user’s  ability  to  spatially  localize  objects  correctly  in  3-D  displays. 
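
The line-of-sight ambiguity described above can be illustrated with a minimal Python sketch. It assumes a simple pinhole camera that looks at the scene origin from a given elevation angle. The function names, the camera distance and the track coordinates are hypothetical choices made for illustration and are not taken from the PVT studies. The sketch shows that a track on the ground and a nearer track at altitude can land on the same screen position in an oblique view.

```python
import math

def camera_position(elevation_deg, distance=100.0):
    """Camera on the -Y side of the origin, raised elevation_deg above the ground plane."""
    theta = math.radians(elevation_deg)
    return (0.0, -distance * math.cos(theta), distance * math.sin(theta))

def project(point, elevation_deg, distance=100.0):
    """Pinhole projection of a world point (x, y, z) to screen (sx, sy).

    Assumed axes: x = right, y = into the scene, z = altitude.
    The camera always looks straight at the world origin.
    """
    theta = math.radians(elevation_deg)
    cam = camera_position(elevation_deg, distance)
    forward = (0.0, math.cos(theta), -math.sin(theta))  # viewing direction
    right = (1.0, 0.0, 0.0)                             # screen horizontal
    up = (0.0, math.sin(theta), math.cos(theta))        # screen vertical
    v = [p - c for p, c in zip(point, cam)]
    depth = sum(a * b for a, b in zip(v, forward))
    sx = sum(a * b for a, b in zip(v, right)) / depth
    sy = sum(a * b for a, b in zip(v, up)) / depth
    return sx, sy

def along_line_of_sight(cam, target, fraction):
    """Point located `fraction` of the way from the camera to the target."""
    return tuple(c + fraction * (t - c) for c, t in zip(cam, target))

# Hypothetical tracks: one on the ground, one airborne on the same line of sight.
ELEVATION = 30.0
ground_track = (20.0, 30.0, 0.0)
air_track = along_line_of_sight(camera_position(ELEVATION), ground_track, 0.8)

print("ground track:", ground_track, "->", project(ground_track, ELEVATION))
print("air track:   ", air_track, "->", project(air_track, ELEVATION))
```

Both tracks print the same screen coordinates even though they differ in range and altitude. A plan view would separate them on the ground plane but would hide the altitude difference.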

Given  these  comparative  advantages  and  limitations,  the  question  is  when  and  how  to  use 
2-D  and  3-D  displays  effectively.  We  proposed  a  distinction  between  tasks  that  require 
shape  understanding  and  tasks  that  require  precise  judgments  of  relative  position  (St. 
John  et  al.,  2001).  We  hypothesized  that  3-D  views  are  useful  for  understanding  object 
shape,  but  2-D  views  are  more  useful  for  understanding  the  relative  positions  of  objects. 
We  confirmed  these  hypotheses  in  two  experiments  involving  simple  block  shapes.  We 
created  simple  3-D  block  shapes  that  were  rendered  as  a  3-D  perspective  view  or  as  a  set 
of  2-D  views  (see  Figure  3).  Participants  viewed  blocks  in  2-D  or  a  3-D  perspective  view 
and  either  performed  a  shape  understanding  task  (e.g.  identification  or  mental  rotation)  or 
a  relative  position  task  (e.g.  determining  directions  and  distances  between  objects  or 
navigation between them). We found that participants were faster and more accurate
using the 3-D views than the 2-D views for the shape understanding tasks, even when
blocks were rotated 90 degrees. Conversely, with the same stimuli, participants were
faster and more accurate using the 2-D views for the relative position tasks.

Figure 3. 2-D and 3-D views of an example block and ball used by St. John et al. (2001).


The  block  stimuli  were  chosen  for  their  simplicity  and  generality  to  test  the  hypothesis 
while  minimizing  confounding  variables.  How  might  the  results  generalize  to  more 
complex  and  natural  stimuli  that  are  likely  to  be  shown  in  C2  displays?  To  investigate 
this  issue,  we  extended  our  hypothesis  in  three  experiments  involving  complex  terrain 
(St.  John,  Smallman,  Oonk  &  Cowen,  2000;  St.  John,  Oonk  &  Cowen,  2000). 
Participants  viewed  a  7  by  9  mile  piece  of  terrain  depicted  in  2-D  or  3-D  with  or  without 
shading  and  grids  on  the  ground-plane.  Briefly,  in  one  Terrain  Experiment,  participants 
chose  the  correct  ground-level  view  from  among  four  alternatives  (see  Figure  4).  For  this 
shape understanding task, participants were faster with the 3-D views. In another
experiment, participants judged whether or not one location was visible
from another location or obstructed by intervening terrain. Both tasks involved
shape understanding because they hinged on understanding the gross layout of the terrain.
Again,  participants  were  faster  with  the  3-D  views.  In  other  Terrain  Experiments, 
participants  judged  which  of  two  locations  was  higher  and  how  to  get  from  one  location 
to  another.  For  these  relative  position  tasks,  participants  were  more  accurate  with  the  2-D 
topographic  maps,  confirming  our  hypothesis. 


Figure  4.  A  trial  in  the  Four-Corners  task  (St.  John,  Oonk  &  Cowen,  2000).  Participants 
imagine  standing  on  the  ground  at  the  white  cross  and  looking  to  the  southeast.  They 
then  pick  the  correct  view  from  the  four  alternatives.  The  correct  answer  is  top-right. 

One  obvious  way  of  improving  localization  with  3-D  displays  is  to  increase  the  depth 
cues  in  them  (Nagata,  1993).  Static  3-D  perspective  displays  may  have  as  few  as  three  of 
the  10  cues  available  to  normal  vision  (occlusion,  linear  perspective  and  shading).  Certain 
other  cues  (e.g.  texture  gradients  and  atmospheric  haze)  increase  display  realism  (and 
hence  desirability  to  users)  but  do  not  increase  localization  performance.  For  example, 
our  own  research  has  shown  that  varying  the  relative  size  of  tracks  is  a  poor  way  to 
improve  users’  ability  to  localize  them  in  space  in  air  warfare  displays  (Smallman, 
Schiller  &  Cowen,  2000).  Further,  consider  that  if  all  10  depth  cues  were  present,  we 
would have achieved the perceptual performance of regular vision. That may not be
something to be proud of: a century of perceptual work since Helmholtz has documented
the fallibility of, and inaccuracies inherent in, human depth perception.

Recently,  we  have  begun  to  address  the  question  of  whether  the  geometry  of  perspective 
projection  makes  3-D  perspective  displays  inherently  poorer  for  relative  position  tasks 
(Smallman,  St.  John  &  Cowen,  2002).  We  hypothesized  that  the  visual  system  can  only 
generate  precise  relative  position  estimates  from  affine-transformed  (roughly  speaking, 
linearly  transformed)  image  geometry.  When  faced  with  non-affine  transformations  (e.g. 
perspective  projections),  the  system  will  resort  to  the  use  of  the  most  linear  cues  available 
to  reconstruct  the  scene  and  these  may  be  suboptimal,  hence  deteriorating  relative 
position  performance.  Consistent  with  this  novel  theory,  we  empirically  measured  and 
then  mathematically  modeled  the  perceptual  biases  found  in  participants’  perceptual 
reconstruction  of  3-D  scenes.  Participants  reconstructed  the  length  of  10  test  posts 
scattered  across  a  3-D  scene  to  match  the  physical  length  of  a  reference  post.  The  test 
posts  were  all  oriented  in  the  X,  Y  or  Z  cardinal  directions  of  3-D  space.  Four  viewing 
angles  from  90  degrees  (“2-D”)  down  to  22.5  degrees  (“3-D”)  were  used.  Participants’ 
reconstructions of post lengths systematically underestimated the compression of
distances into the scene (Y) and systematically overestimated the compression of height
(Z). The length mismatches could be modeled by assuming that linear perspective (which
operates accurately only in X) is inappropriately used to scale matching lengths in all
three dimensions of space. Only the 90 degree (2-D) view led to correct matches in both
the X and Y dimensions. This theory actually offers a novel explanation of why
perceived  distances  are  systematically  underestimated  in  the  real  world. 
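
To make the geometric compression concrete, the sketch below computes how long a unit-length post along each cardinal axis appears on screen at the two extreme viewing angles reported above. It uses a parallel-projection approximation, which is an assumption made here for simplicity: it ignores the additional change of size with distance that true perspective adds. It is not the perceptual model of Smallman, St. John & Cowen (2002), and the function name is hypothetical.

```python
import math

def projected_unit_lengths(elevation_deg):
    """On-screen lengths of unit posts along X, Y and Z under parallel projection.

    elevation_deg is the viewing angle to the ground plane (90 = plan view).
    """
    theta = math.radians(elevation_deg)
    return {
        "X": 1.0,              # across the screen: never compressed
        "Y": math.sin(theta),  # into the scene: compressed as the view gets shallower
        "Z": math.cos(theta),  # height: invisible from directly above
    }

for angle in (90.0, 22.5):
    lengths = projected_unit_lengths(angle)
    print(angle, {axis: round(value, 3) for axis, value in lengths.items()})
```

At 90 degrees the Y post keeps its full length and the Z post collapses to zero. At 22.5 degrees the Y post shrinks to about 0.38 of its length while the Z post retains about 0.92. Under this approximation, a viewer who scaled all three axes as if they behaved like X would misjudge both Y and Z at any oblique angle.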

In  sum,  when  “where”  matters  with  some  precision,  use  a  2-D  view  of  a  scene.  However, 
when  only  a  gross  sense  of  the  layout  or  shape  of  the  scene  is  required,  a  3-D  view  can  be 
useful. 

What 

In addition to where a track is in space, there is the issue of what that track is: its
identity. The AADC display is popular partly because it depicts aircraft and ships as
miniature realistic icons, whereas conventional C2 displays show them as less familiar
military symbols (see Figure 5).

Using  a  battery  of  tasks  including  naming  (Smallman,  St.  John,  Oonk,  &  Cowen,  2000), 
recall  (Smallman,  Schiller,  &  Mitchell,  2000)  and  visual  search  (Smallman,  Oonk,  St. 
John,  &  Cowen,  2001)  for  standard  military  symbols  (MIL-STD-2525B,  US  Department 
of  Defense,  1996)  compared  with  realistic  icons,  we  have  found  a  fairly  consistent  pattern 
of  results.  As  we  found  in  the  Where  section  above,  the  beguiling  realism  of  3-D 
perspective  view  displays  actually  serves  to  undermine  their  utility  for  many  tasks.  An 
iconic  code  retains  a  visual  similarity  between  the  depicted  object  and  its  referent.  When 
what  has  to  be  displayed  is  a  set  of  inherently  similar  objects  (many  aircraft  look 
somewhat  alike,  as  do  many  ships)  then  users  will  have  difficulty  discriminating  their 
icons and will consistently misidentify them (see Figure 5). Abstract symbols, on the
other hand, can be made arbitrarily distinct. However, we have found that icons are
superior  to  symbols  for  conveying  category  (air  or  sea)  and  heading  information. 

Figure 5. Two ways of mapping real-world military platforms onto a display, left, as
realistic  3-D  icons,  right,  as  2-D  symbols  -  from  the  Military  Standard  2525B  symbol  set 
(from  Smallman,  Oonk,  St.  John  &  Cowen,  2000). 


The complementary advantages of symbols for some attributes (platform identity and
affiliation) and icons for others (heading and platform category) suggested to us the
potential of a new symbology that combines the best aspects of symbols and icons. We
call this hybrid symbology “Symbicons” (see Figure 6).

Figure 6. A fighter Symbicon is created by combining the interior of a conventional
MIL-STD-2525B  symbol  with  a  discriminable,  cartooned  outline  of  a  realistic  icon. 
Symbicons  are  intended  to  combine  the  best  aspects  of  symbols  and  icons  (from 
Smallman,  St.  John,  Oonk  &  Cowen,  2001). 

In a visual search experiment, we established that Symbicons were as good as, if not
better than, symbols and icons in either speed or errors for all four of the asset attributes
listed in Figure 6 (Smallman, St. John, Oonk & Cowen, 2001). Hence, Symbicons were
shown to successfully combine the best aspects of symbols and icons.

In  sum,  when  “what”  matters,  discriminable  caricatures  may  be  more  effective  than  full 
realism,  even  if  full  realism  is  preferred  by  users. 

How 

C2  tasks  are  complex  and  are  likely  to  contain  task  elements  that  require  both  shape 
understanding  (better  in  3-D)  and  also  comprehending  the  relative  position  of  objects 
(better in 2-D). How should displays, or suites of displays, be used to best serve the
complex task requirements of C2? To address this question empirically, we have
developed a quasi-realistic C2 tactical routing task (the “Antenna Task”) that requires
threading  a  chain  of  antennas  across  terrain  while  remaining  out  of  line  of  sight  of  enemy 
units  (St.  John,  Smallman,  Bank  &  Cowen,  2001). 

The  Antenna  Task  is  fairly  difficult  (see  Figure  7).  It  requires  placing  a  number  of 
antennas  in  precise  locations  that  satisfy  a  large  number  of  constraints  concerning  the 
shape  of  the  terrain  and  multiple  lines  of  sight.  The  task  requires  a  good  understanding  of 
the  shape  of  the  terrain  for  finding  promising  routes  and  for  hiding  antennas,  which  we 
previously  found  to  be  easier  using  a  3-D  view.  The  Antenna  Task  also  requires  precise 
judgments  of  line  of  sight  based  on  the  relative  heights  and  distances  among  antennas  and 
the terrain. The relative benefits of 2-D and 3-D views for this aspect of the task are more
difficult to predict. In previous work (St. John et al., 2001), participants judged whether
two  points  on  terrain  were  in  view  of  each  other.  This  task  appeared  to  require  only  a 
very  gross  understanding  of  the  terrain  -  whether  a  large  mountain  or  range  of  hills  was 
obstructing  a  view,  and  in  fact,  a  3-D  perspective  view  proved  superior  to  a  2-D 
topographic view. In contrast, line of sight judgments in the Antenna Task often require
far  more  precision  to  determine  whether  antennas  are  just  in  or  out  of  a  line  of  sight.  This 
fine  precision  hinges  on  obtaining  precise  judgments  of  the  distances,  angles,  and  relative 
heights  of  points  on  the  terrain.  We  previously  found  such  tasks  to  be  easier  using  a  2-D 
view.  In  contrast  to  finding  generally  promising  routes,  then,  the  exact  placements  of  the 
antennas  may  benefit  from  a  2-D  view. 
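
The line-of-sight judgment at the heart of the Antenna Task can be expressed as a simple geometric check. The sketch below is a minimal illustration, assuming the terrain is stored as a regular grid of heights. The grid layout, function names, mast height and sample count are all hypothetical and are not drawn from the experimental software. It samples the straight sight line between two positions and reports an obstruction if the terrain rises above that line anywhere along the way.

```python
import math

def height_at(grid, x, y):
    """Bilinearly interpolate terrain height at fractional grid position (x, y).

    grid[row][col] holds heights; x indexes columns and y indexes rows.
    """
    rows, cols = len(grid), len(grid[0])
    x = min(max(x, 0.0), cols - 1.0)   # clamp to the grid
    y = min(max(y, 0.0), rows - 1.0)
    c0, r0 = int(math.floor(x)), int(math.floor(y))
    c1, r1 = min(c0 + 1, cols - 1), min(r0 + 1, rows - 1)
    fx, fy = x - c0, y - r0
    near = grid[r0][c0] * (1 - fx) + grid[r0][c1] * fx
    far = grid[r1][c0] * (1 - fx) + grid[r1][c1] * fx
    return near * (1 - fy) + far * fy

def has_line_of_sight(grid, a, b, mast_height=0.0, samples=200):
    """True if the straight sight line from a to b clears the terrain.

    a and b are (x, y) grid positions; mast_height raises both endpoints.
    """
    ax, ay = a
    bx, by = b
    za = height_at(grid, ax, ay) + mast_height
    zb = height_at(grid, bx, by) + mast_height
    for i in range(1, samples):
        t = i / samples
        x, y = ax + t * (bx - ax), ay + t * (by - ay)
        sight_z = za + t * (zb - za)      # height of the sight line at this sample
        if height_at(grid, x, y) > sight_z:
            return False                  # terrain pokes above the sight line
    return True

# Hypothetical terrain: a flat plain, and the same plain with a ridge in the middle.
flat = [[0.0] * 5 for _ in range(3)]
ridge = [[0.0, 0.0, 5.0, 0.0, 0.0] for _ in range(3)]

print(has_line_of_sight(flat, (0, 1), (4, 1)))                     # True
print(has_line_of_sight(ridge, (0, 1), (4, 1)))                    # False
print(has_line_of_sight(ridge, (0, 1), (4, 1), mast_height=6.0))   # True
```

The last two calls differ only in mast height, which mirrors the point made above: whether an antenna is just in or just out of a line of sight depends on precise relative heights and distances, not on the gross shape of the terrain.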

We found that the Antenna Task was difficult, but that participants performed better with
the 2-D view than with the 3-D view. We believe that this is because participants were
forced to spend the majority of their time on the fine placement of antennas on the maps,
which was a precise relative position task.


Figure  7.  The  2-D  plan  view  (left)  and  3-D  perspective  view  (right)  in  the  antenna 
placement  experiment.  Enemy  positions  are  identified  by  flags  with  a  red  “X”.  Antennas 
are  identified  by  flags  with  blue  circles  (from  St.  John,  Smallman,  Bank  &  Cowen,  2001). 

In  a  follow-on  experiment,  called  “pick-a-path”,  participants  were  shown  three  potential 
routes  across  the  terrain  for  constructing  their  chain  of  antennas  (St.  John,  Smallman, 
Bank  &  Cowen,  2001).  One  of  the  three  routes  was  much  more  promising  than  the  other 
two,  in  that  it  followed  canyons,  and  skirted  hill  tops  to  remain  out  of  enemy  lines  of 
sight.  Participants  were  shown  the  terrain  and  routes  in  either  2-D  topographic  views  or 
3-D  perspective  views.  Performance  using  the  3-D  perspective  views  was  much  faster. 
This  result  suggested  to  us  a  new  human  factors  design  concept  for  C2  that  we  call  Orient 
and Operate. Users orient to the layout of a scene using a 3-D view, but then switch to
2-D views to interact with and operate on the scene. A 3-D view may work best to gain a
basic grasp of the terrain and of the shapes and locations of routes and objects. However, 3-D
may be too ambiguous and distorted for precise judgments. Once a rough sense of layout
and shape is obtained, a 2-D view may work best for achieving a precise grasp of
relative positions and exact shapes.


Further  supporting  the  Orient  and  Operate  concept,  we  found  that  participants  performed 
the  best  when  provided  both  a  2-D  plan  view  and  a  3-D  perspective  view  side  by  side. 
However,  the  effect  was  of  small  magnitude  and  we  believe  that  better  configurations  of 
views  are  possible.  Our  suspicion  is  that  placing  views  side  by  side,  although  a  natural 
first  step  at  display  combination,  is  not  an  optimal  arrangement  for  creating  an  effective 
suite  of  displays.  Moving  from  one  view  to  the  other  requires  considerable  re-orientation 
to the scene by the user. What are needed now are methods for improving the
correspondences between objects in the views that will alleviate the effects of
re-orientation. The concept of visual momentum (see Woods, 1984) offers ideas, such as
the  use  of  natural  and  artificial  landmarks  and  consistent  and  compatible  representations 
(Wickens  and  Carswell,  1995),  for  improving  the  correspondence  between  multiple 
views.  Investigation  of  these  and  other  concepts  is  currently  underway. 


Conclusions 

3-D  perspective  view  displays  are  coming.  They  are  compelling  and  attractive  to  users 
because  of  their  realism,  but  counter  to  many  of  our  intuitions,  they  are  actually  less 
useful  for  a  range  of  C2  tasks  than  well-designed  2-D  displays.  There  is  more  to  display 
design  than  photo-realism.  Users  are  better  served  by  designers  who  consider  the  nature 
of  the  user’s  tasks  and  then  tailor  the  display  view,  symbology  and  depth  cues  to  best  suit 
those  specific  tasks.  Finally,  consider  that  without  experimental  research  programs  such 
as  the  one  reviewed  here,  users  might  be  given  3-D  perspective  displays  for  C2  tasks  that 
are  inappropriate  and  interfere  with  their  job  performance. 